nlp_architect.data.sequential_tagging.SequentialTaggingDataset

class nlp_architect.data.sequential_tagging.SequentialTaggingDataset(train_file, test_file, max_sentence_length=30, max_word_length=20, tag_field_no=2)

Sequential tagging dataset loader. Loads train and test files whose columns are tab-separated.

Parameters:
  • train_file (str) – path to train file
  • test_file (str) – path to test file
  • max_sentence_length (int, optional) – max sentence length
  • max_word_length (int, optional) – max word length
  • tag_field_no (int, optional) – index of the column to use as y-samples
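
A minimal sketch of how these parameters might be used. The sample data, the file names, and the helpers write_sample and load_dataset are hypothetical; the layout (one token per line, a blank line between sentences) is an assumption based on common tagging corpora and is not stated in this reference. The constructor call uses only the documented parameters and their documented defaults, and it is skipped when nlp_architect is not installed.

```python
import os

# Hypothetical sample: tab-separated columns (token, POS tag, chunk tag),
# one token per line, blank line between sentences (assumed layout).
SAMPLE = (
    "The\tDT\tB-NP\n"
    "cat\tNN\tI-NP\n"
    "sleeps\tVBZ\tB-VP\n"
    "\n"
    "Dogs\tNNS\tB-NP\n"
    "bark\tVBP\tB-VP\n"
)


def write_sample(directory, name):
    """Write SAMPLE to directory/name and return the resulting path."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as fp:
        fp.write(SAMPLE)
    return path


def load_dataset(train_path, test_path):
    """Build the dataset, or return None if nlp_architect is unavailable."""
    try:
        from nlp_architect.data.sequential_tagging import SequentialTaggingDataset
    except ImportError:
        return None
    # Values below are the documented defaults, spelled out for clarity.
    return SequentialTaggingDataset(
        train_path,
        test_path,
        max_sentence_length=30,
        max_word_length=20,
        tag_field_no=2,
    )
```

With the library installed, load_dataset(write_sample(d, "train.txt"), write_sample(d, "test.txt")) would return a dataset object for some directory d. Which column tag_field_no=2 selects depends on whether the index is counted from 0 or 1, which this reference does not state.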
__init__(train_file, test_file, max_sentence_length=30, max_word_length=20, tag_field_no=2)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(train_file, test_file[, …]) Initialize self.

Attributes

  • char_vocab – characters vocabulary
  • char_vocab_size – character vocabulary size
  • test_set – Get the test set
  • train_set – Get the train set
  • word_vocab – words vocabulary
  • word_vocab_size – word vocabulary size
  • y_labels – return y labels
char_vocab

characters vocabulary

char_vocab_size

character vocabulary size

test_set

Get the test set

train_set

Get the train set

word_vocab

words vocabulary

word_vocab_size

word vocabulary size

y_labels

return y labels
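
The attributes above can be inspected on a loaded dataset. The sketch below is illustrative only: summarize is a hypothetical helper, and because this reference does not state the attributes' return types, it reports only the type names it finds. A stand-in object with made-up values is used so that the example runs without nlp_architect.

```python
from types import SimpleNamespace

# Attribute names taken from the attribute listing in this reference.
ATTRIBUTES = (
    "char_vocab",
    "char_vocab_size",
    "word_vocab",
    "word_vocab_size",
    "y_labels",
    "train_set",
    "test_set",
)


def summarize(dataset):
    """Return {attribute name: type name} for the documented attributes."""
    return {name: type(getattr(dataset, name)).__name__ for name in ATTRIBUTES}


# Stand-in with made-up values; a real SequentialTaggingDataset instance
# would be passed to summarize() instead.
fake = SimpleNamespace(
    char_vocab={"a": 1},
    char_vocab_size=1,
    word_vocab={"cat": 1},
    word_vocab_size=1,
    y_labels={"B-NP": 1},
    train_set=(),
    test_set=(),
)
summary = summarize(fake)
```

Running summarize on a real dataset instance is a quick way to check what each attribute actually returns before wiring it into a model.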